NOTES ON PAIR CORRELATION OF ZEROS AND PRIME NUMBERS



D. A. GOLDSTON 

These notes are based on my four lectures given at the Newton Institute in April 2004 during 
the Recent Perspectives in Random Matrix Theory and Number Theory Workshop. Their purpose 
is to introduce the reader to the analytic number theory necessary to understand Montgomery's 
work on the pair correlation of the zeros of the Riemann zeta-function and subsequent work on
how this relates to prime numbers. A very brief introduction to Selberg's work on the moments 
of S(T) is also given. 

1. Introduction and Some Personal History 

In 1973 Montgomery's paper [27], "The Pair Correlation of Zeros of the Zeta
Function," appeared in the AMS series of Proceedings of Symposia in Pure
Mathematics, and a new field of study was born, slowly. I first came across this paper in
1977, and was probably the only person at Berkeley to read it. Most zeta-function
people (as some of us refer to ourselves) recognized the importance of this work
and the new phenomena discovered, but it was not clear what to do next. At
first, the main interest was in using Montgomery's conjectures to refine the
classical results on primes obtained assuming the Riemann Hypothesis. Gallagher and
Mueller [12] wrote an important paper on this in 1978, followed by further results
from Heath-Brown [21]. In 1981 I wrote my Ph.D. thesis on this topic. A few years
later Montgomery and I [18] obtained an equivalence between the pair correlation
conjecture and primes. However, this work attracted little attention, probably
because the results were obtained using Montgomery's conjectures. Then, in the
early 1980's everything changed: Odlyzko pH] computed statistics on the zeros and
convinced even the most skeptical that after almost a century of intensive study a
totally new, unsuspected, and fundamental property of the zeta-function had been
discovered. The field has since had a flood of activity, with the generalization of
Montgomery's work to higher correlations by Hejhal j2H| and Rudnick-Sarnak [32],
the interpretation of these results in terms of mathematical physics by Berry and
Bogomolny-Keating, the function field case of Katz-Sarnak, the random
matrix model for moments of the zeta-function of Keating and Snaith culminating
in 0], and a profusion of new work.

In these notes, I will discuss Montgomery's results and their relations to primes. 
As a unifying tool, I will use Montgomery's explicit formula [27] to prove a number
of later results that were originally obtained by other methods. This approach was 
first made use of in part of my Ph.D. thesis, and was based on a suggestion of 
Montgomery in a letter. At that time Heath-Brown had just finished his paper 
which covered the same ground, and I saw no need to publish this material beyond
the summary that appeared in [15]. My goal, in line with the emphasis of the
workshop on reaching out to beginners in the field, is to provide some of the main 
ideas used without technicalities and at the same time supply simple details which 
would be accepted without comment by experts. I have intentionally left out many 
things to keep these notes focused. The last section on Selberg's theory of S(T) 
and $\log \zeta(s)$ is somewhat different from the previous ones, and I have decided to
state only the main results and present a few of the ideas that are used. 

I would like to thank Andrew Ledoan for the many improvements he suggested 
for these notes. 



2. Basic Facts and Notation 

Following Riemann, we use the complex variable $s = \sigma + it$. The Riemann
zeta-function $\zeta(s)$ is defined, for $\sigma > 1$, by either the Dirichlet series or the Euler
product

(2.1) $\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \left(1 - \frac{1}{p^s}\right)^{-1}.$

Here p will always denote a prime, so the product is over all the prime numbers. 
To extract information about primes from the Euler product, we compute the 
logarithmic derivative of the zeta-function and use the power series for $-\log(1 - z)$,
to obtain, for $\sigma > 1$,

(2.2) $\frac{\zeta'}{\zeta}(s) := \frac{\zeta'(s)}{\zeta(s)} = \frac{d}{ds}\log\zeta(s) = \frac{d}{ds}\left(\sum_{m=1}^{\infty}\sum_{p}\frac{1}{m p^{ms}}\right) = -\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^s},$

where the von Mangoldt function $\Lambda(n)$ is given by

(2.3) $\Lambda(n) = \begin{cases} \log p, & \text{if } n = p^m,\ p \text{ prime},\ m \ge 1, \\ 0, & \text{otherwise.} \end{cases}$

The Chebyshev function $\psi(x)$ is the counting function for $\Lambda(n)$ given by

(2.4) $\psi(x) = \sum_{n \le x} \Lambda(n).$

Because of the simple relationship with the zeta-function, it is preferable to use $\Lambda(n)$
in place of the indicator function for the primes, and $\psi(x)$ in place of the counting
function $\pi(x)$ for the number of primes up to $x$. If needed, one can usually recover
$\pi(x)$ from $\psi(x)$ by simple arguments. The Prime Number Theorem (PNT) states
that as $x \to \infty$

(2.5) $\psi(x) \sim x, \quad \text{or} \quad \pi(x) \sim \frac{x}{\log x}.$

The PNT with the error term obtained by de la Vallée Poussin in 1899 is, for a
small constant $c$,

(2.6) $\psi(x) = x + O\left(x e^{-c\sqrt{\log x}}\right),$

which on returning to $\pi(x)$ gives ($c$ may differ from equation to equation)

(2.7) $\pi(x) = \mathrm{li}(x) + O\left(x e^{-c\sqrt{\log x}}\right),$

where the logarithmic integral

$\mathrm{li}(x) = \int_2^x \frac{du}{\log u}$

is the actual main term in the theorem. For the error term above, we have for any
constant $A > 0$

(2.8) $e^{-c\sqrt{\log x}} \ll_A \frac{1}{(\log x)^A}.$

Here, the Vinogradov notation $\ll$ is equivalent to "big oh" of the right-hand side.
This estimate is freely used when the PNT is invoked.
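As a small numerical illustration of the definitions of $\Lambda(n)$ and $\psi(x)$ and of the PNT in the form $\psi(x) \sim x$, the following sketch tabulates $\Lambda(n)$ with a sieve and prints $\psi(x)/x$ for a few values of $x$. The function names `von_mangoldt_table` and `chebyshev_psi` are illustrative choices of ours, not notation from the text.

```python
import math

def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(2 * p, limit + 1, p):
            is_composite[multiple] = True
        # Lambda(p^m) = log p for every prime power p^m <= limit.
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam

def chebyshev_psi(x, lam):
    """psi(x) = sum of Lambda(n) over n <= x."""
    return sum(lam[1:int(x) + 1])

lam = von_mangoldt_table(10_000)
for x in (10, 100, 1_000, 10_000):
    psi = chebyshev_psi(x, lam)
    print(x, round(psi, 3), round(psi / x, 4))
```

The ratio in the last column should drift toward 1, slowly, in line with the weak error terms available unconditionally.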

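We frequently need the Dirichlet series for $\zeta(s)^{-1}$, which from the Euler product
is

(2.9) $\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}$

for $\sigma > 1$, where the Möbius function is defined by $\mu(1) = 1$ and

(2.10) $\mu(n) = \begin{cases} (-1)^m, & \text{if } n = p_1 p_2 \cdots p_m,\ p_i \text{ distinct}, \\ 0, & \text{if } p^2 \mid n, \text{ some } p. \end{cases}$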
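The Dirichlet series for $1/\zeta(s)$ can be checked numerically at $s = 2$, where $1/\zeta(2) = 6/\pi^2$. The sketch below is only an illustration: the sieve `mobius_table` and the cutoff $10^4$ are our own choices.

```python
import math

def mobius_table(limit):
    """mu[n] for 1 <= n <= limit via a simple sieve (mu[0] is unused)."""
    mu = [1] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(p, limit + 1, p):
            if multiple > p:
                is_composite[multiple] = True
            mu[multiple] = -mu[multiple]   # one more distinct prime factor
        for multiple in range(p * p, limit + 1, p * p):
            mu[multiple] = 0               # p^2 divides n
    return mu

mu = mobius_table(10_000)
partial = sum(mu[n] / n**2 for n in range(1, 10_001))
print(partial, 6 / math.pi**2)
```

Since $|\mu(n)| \le 1$, the tail of the series beyond $N$ is at most $1/N$ in absolute value, so the two printed numbers agree to about four decimal places.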

The zeta-function has a simple pole with residue 1 at $s = 1$, trivial zeros at
$s = -2n$, $n = 1, 2, 3, \ldots$, and complex zeros

(2.11) $\rho = \beta + i\gamma, \quad 0 < \beta < 1.$

The inequality $\beta < 1$ is the key result needed in the analytic proofs of the PNT.
The zeros are positioned symmetrically with respect to the real line and the "$\frac12$-line" $\frac12 + it$,
so that $\rho$, $\bar\rho$, $1 - \rho$, and $1 - \bar\rho$ are all zeros. The Riemann Hypothesis (RH) is the
conjecture that $\beta = \frac12$, and thus $\rho = \frac12 + i\gamma$. For example, the first 6 zeros in the
upper half of the critical strip are

(2.12) $\tfrac12 + i\,14.13472\ldots, \quad \tfrac12 + i\,21.02203\ldots, \quad \tfrac12 + i\,25.01085\ldots,$
       $\tfrac12 + i\,30.42487\ldots, \quad \tfrac12 + i\,32.93506\ldots, \quad \tfrac12 + i\,37.58617\ldots.$

To count the number of complex zeros in a given region, we define 

(2.13) $n(T) = |\{\gamma : 0 < \gamma \le T\}|, \quad N(T) = \frac{n(T+0) + n(T-0)}{2},$

where $|A|$ denotes the number of elements of the set $A$. Note that $N(T)$ counts any
zeros $\gamma = T$ with weight one-half, which arises naturally in the theory; therefore we
always use $N(T)$ in preference to $n(T)$. The Riemann-von Mangoldt formula for
$N(T)$, obtained by applying the argument principle to $\zeta$ and using the functional
equation (see [23 > ESDi), is

(2.14) $N(T) = \frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + \frac{7}{8} + R(T) + S(T),$

where

(2.15) $R(T) \ll \frac{1}{T}$

and

(2.16) $S(T) = \frac{1}{\pi}\arg\zeta\left(\tfrac12 + iT\right) \ll \log T.$

In fact, $R(T)$ is continuous, differentiable, and can be expanded into a series in
inverse powers of $T$. We see that (2.14) provides a remarkably precise formula for
the number of zeros up to height $T$, with the finer details of the vertical distribution
of zeros wrapped up in the study of $S(T)$. In particular we have

(2.17) $N(T) \sim \frac{T}{2\pi}\log T.$

Another consequence of (2.14)-(2.16) which we make frequent use of is the sharp
estimate

(2.18) $N(T+1) - N(T) = \sum_{T < \gamma \le T+1} 1 \ll \log T.$
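To get a feeling for how precise the Riemann-von Mangoldt formula is, one can compare the smooth main term $\frac{T}{2\pi}\log\frac{T}{2\pi} - \frac{T}{2\pi} + \frac78$ with the actual count of the six zeros listed in (2.12). The sketch below assumes only those six truncated ordinates, so it is meaningful only for $T < 40$; the helper names are ours.

```python
import math

# Ordinates of the first six zeros, as listed in (2.12) (truncated decimals).
GAMMAS = [14.13472, 21.02203, 25.01085, 30.42487, 32.93506, 37.58617]

def main_term(T):
    """(T/2pi) log(T/2pi) - T/2pi + 7/8, i.e. N(T) without R(T) and S(T)."""
    u = T / (2 * math.pi)
    return u * math.log(u) - u + 7 / 8

def count_zeros(T):
    """Number of listed zeros with 0 < gamma <= T (valid for T < 40)."""
    return sum(1 for g in GAMMAS if 0 < g <= T)

for T in (15, 22, 26, 31, 33, 38):
    print(T, count_zeros(T), round(main_term(T), 3))
```

The difference between the two columns is essentially $S(T)$, which stays below 1 in absolute value in this range.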

3. Explicit Formulas 

To study the relationship between the zeros of the zeta-function and primes you 
need to be able to work with explicit formulas. There are many such formulas but 
the best known is the Riemann-von Mangoldt explicit formula, which states that, 
for x > 1, 



(3.1) $\psi_0(x) = x - \sum_{\rho} \frac{x^{\rho}}{\rho} - \log 2\pi - \frac{1}{2}\log\left(1 - \frac{1}{x^2}\right),$

where $\psi_0(x) = \frac{1}{2}\left(\psi(x+0) + \psi(x-0)\right)$. By (2.14), the sum is not absolutely
convergent and the terms are added with $\rho$ and $\bar\rho$ grouped together. The explicit
formula also contains this information, since on taking $x \to 1^+$ and letting $x = e^u$
we see that

(3.2) $\sum_{\rho} \frac{e^{u\rho}}{\rho} = \frac{1}{2}\log\frac{1}{u} + O(1) \quad \text{as } u \to 0^+.$

For applications we usually use the truncated version of (3.1),

(3.3) $\psi(x) = x - \sum_{|\gamma| \le T} \frac{x^{\rho}}{\rho} + O\left(\frac{x}{T}(\log xT)^2\right) + O\left((\log x)\min\left(1, \frac{x}{T\|x\|}\right)\right),$

where $\|x\|$ denotes the distance of $x$ to the closest integer, and the last term reflects
the jumps of $\psi(x)$ at the primes and prime powers. As an example of an application
of (3.3), assuming RH we have

$\left|\frac{x^{\rho}}{\rho}\right| \le \frac{x^{1/2}}{|\gamma|},$

and so by (2.12) and (2.18), with $[x]$ denoting the integer part of $x$,

$\sum_{|\gamma| \le T} \left|\frac{x^{\rho}}{\rho}\right| \le 2x^{1/2}\sum_{1 < \gamma \le T}\frac{1}{\gamma} \le 2x^{1/2}\sum_{n=1}^{[T]}\frac{1}{n}\sum_{n < \gamma \le n+1} 1 \ll x^{1/2}\sum_{n \le 2T}\frac{\log(n+1)}{n} \ll x^{1/2}(\log T)^2.$

Thus, taking $T = x$ in (3.3) we have

(3.4) $\psi(x) = x + O\left(x^{1/2}(\log x)^2\right).$

It can also be proved that this estimate implies RH, and therefore is equivalent to
the RH.$^{1}$ Equation (3.4) is due to von Koch in 1901 and has never been improved.
We next apply (3.4) to the problem of large gaps between primes. Let $p_n$ denote
the $n$-th prime number. The highest power of a prime $\le x$ is the largest $k$ for which
$2^k \le x$, so $k = [\log_2 x]$. By the PNT,

$\psi(x) = \sum_{p \le x}\log p + \sum_{2 \le m \le \log_2 x}\ \sum_{p^m \le x}\log p$
$\qquad = \sum_{p \le x}\log p + \sum_{p \le \sqrt{x}}\log p + O\left(\pi(x^{1/3})(\log x)^2\right)$
$\qquad = \sum_{p \le x}\log p + O\left(x^{1/2}\right).$

Thus (3.4) continues to hold when we only sum over primes. For $1 \le h \le x$, we
have by (3.4) and differencing that

$\sum_{x < p \le x+h}\log p = h + O\left(x^{1/2}(\log x)^2\right).$

On taking $h = Cx^{1/2}(\log x)^2$, with the constant $C$ being larger than the implicit
absolute constant in the error term, we conclude that the sum on the left is positive
and $\gg h$, and thus the interval $(x, x+h]$ must contain $\gg x^{1/2}\log x$ primes. If $p_n$ is the
first prime in $[x, x+h)$, then

(3.5) $p_{n+1} - p_n < h \ll p_n^{1/2}(\log p_n)^2.$

An explicit formula that also exhibits the close connection between zeros and 
primes is the Landau formula, which states that (for $x$ fixed) as $T \to \infty$

(3.6) $\sum_{0 < \gamma \le T} x^{\rho} = -\frac{T}{2\pi}\Lambda(x) + O(\log T).$

Here we define $\Lambda(x)$ to be zero for real non-integer $x$. Formally this is obtained by
differentiating (3.1) with respect to $x$. The exponential sum over the zeros encodes
the information on which integers are primes or prime powers. Equation (3.6) is
not particularly useful, but Fujii [10] and independently Gonek [19] have developed
uniform versions which can be used in applications.

An explicit formula of at least historic interest is the Cramér explicit formula,
which states that for $\operatorname{Im}(z) > 0$



e pz - — 



A(n) / 1 1 



7>0 n=2 



2ni ^ n \ z — log n log n 



1 ^ A(n) ( 1 1 



^2 y\ 2wi ^--^ n \z + log7i logn 

1 7 + log27r\ / l\ 1 T' 
w 1 + - 1 



4 2ni J \ z 

l 



z 



e sz log\((z)\ds 



2m T \2iriJ 2 



1 f°° t dt 



2iri J 2mz J e* — 1 1 + z 

On taking z = — logr + iy, < y < 1, and letting r — > oo Cramer [7] proved that 

_o(-lotLT+iv) M n ) ( 

(]og£)* + y*J " ' " Vlogr, 



(3.8) — 27rRe £ log T+ly) = £____( ^—^h ~ \-* + 



7>0 n=2 
He used (3.8) and related formulas in a series of papers starting in 1920 to prove
results on primes. One such result is that on RH

(3.9) $p_{n+1} - p_n \ll p_n^{1/2}\log p_n.$

This only saves a logarithm over the trivial use of (3.4) in (3.5) but is the best result
known on RH. We will later assume a much stronger hypothesis and only improve
(3.9) by a half-power of a logarithm. On the other hand, Cramér conjectured [7]
that the gaps between consecutive primes are always much smaller than this size.
Recent work indicates that Cramér's original conjecture may be slightly too strong,
but all evidence still suggests

(3.10) $p_{n+1} - p_n \ll (\log p_n)^2.$
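The contrast between the RH bound $p_n^{1/2}\log p_n$ and the conjectured size $(\log p_n)^2$ is already visible in small tables of primes. The following sketch, with an arbitrary cutoff of $10^5$ and helper names of our own choosing, computes the largest value of each normalized gap.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

primes = primes_up_to(100_000)
pairs = list(zip(primes, primes[1:]))

# Largest gap relative to sqrt(p) log p, the size appearing in (3.9).
worst_rh = max((q - p) / (math.sqrt(p) * math.log(p)) for p, q in pairs)
# Largest gap relative to (log p)^2, the size appearing in (3.10), for p >= 11.
worst_cramer = max((q - p) / math.log(p) ** 2 for p, q in pairs if p >= 11)

print(worst_rh, worst_cramer)
```

In this range the first ratio is attained at the very smallest primes and then decays rapidly, while the second stays bounded without visibly tending to zero, which is the behavior (3.10) predicts.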

At one time I had a fondness for Cramér's formula and made use of it in my thesis,
but I later decided that nothing was to be gained by its use except complicated
arguments. The proof of (3.9), for instance, can now be done from a smoothed
version of (3.1) in just a few lines. However, there have been a number of recent
papers on the structure of Cramér's formula (see [24]).

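Most of these explicit formulas are based on evaluating the contour integral

$\frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \frac{\zeta'}{\zeta}(s)\, K_z(s)\, ds,$

where the kernel $K_z(s)$ is a meromorphic function. Frequently $K_z(s) = K(s + z)$
or $K_z(s) = K(zs)$. If $c > 1$ the Dirichlet series for $\frac{\zeta'}{\zeta}(s)$ converges absolutely and

$\frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \frac{\zeta'}{\zeta}(s)\, K_z(s)\, ds = -\sum_{n=1}^{\infty}\Lambda(n)\hat{K}_z(n), \quad \hat{K}_z(n) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} K_z(s)\, n^{-s}\, ds.$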



One then obtains an explicit formula by moving the contour to the left, thus
encountering poles at $s = 1$ and at the zeros of $\zeta(s)$, as well as any poles of $K_z(s)$.

Another explicit formula frequently used is the Weil explicit formula, which
contains a general weight function and has the advantage of allowing the relationships
between terms to be explicitly exhibited. Our approach here, however, is to use
explicit formulas only as tools for studying zeros and primes. Therefore we will
take the opposite path and stay specific. The formula we will base our work on is
due to Montgomery [27].

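Proposition 1. Assume the Riemann Hypothesis. For $x \ge 1$,

(3.11) $2x^{\frac12 - it}\sum_{\gamma}\frac{x^{i\gamma}}{1 + (t-\gamma)^2} = -\sum_{n=1}^{\infty}\frac{\Lambda(n)a_n(x)}{n^{it}} + \frac{2x^{1-it}}{\left(\frac12 + it\right)\left(\frac32 - it\right)} + x^{-\frac12}\left(\log(|t|+2) + O(1)\right) + O\left(\frac{x^{-2}}{|t|+2}\right),$

where

(3.12) $a_n(x) = \min\left(\left(\frac{n}{x}\right)^{1/2}, \left(\frac{x}{n}\right)^{3/2}\right).$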

Proof. This proposition is proved by using an explicit formula from Landau's 1909
Handbuch [2H|, which states that (unconditionally) for $x > 1$, $x \ne p^m$,

(3.13) $\sum_{n \le x}\frac{\Lambda(n)}{n^s} = -\frac{\zeta'}{\zeta}(s) + \frac{x^{1-s}}{1-s} - \sum_{\rho}\frac{x^{\rho - s}}{\rho - s} + \sum_{n=1}^{\infty}\frac{x^{-2n-s}}{2n+s},$

provided $s \ne 1$, $s \ne \rho$, $s \ne -2n$.$^{2}$ Rewriting we have

(3.14) $\sum_{\rho}\frac{x^{\rho}}{\rho - s} = -x^{s}\left(\frac{\zeta'}{\zeta}(s) + \sum_{n \le x}\frac{\Lambda(n)}{n^s}\right) + \frac{x}{1-s} + \sum_{n=1}^{\infty}\frac{x^{-2n}}{2n+s}.$

This equation holds independently of RH, but assuming RH we have $\rho = \frac12 + i\gamma$.
Letting $s = \frac32 + it$ and using (2.2), the above equation simplifies to read

$-x^{1/2}\sum_{\gamma}\frac{x^{i\gamma}}{1 + i(t-\gamma)} = x^{it}\sum_{n > x}\frac{\Lambda(n)a_n(x)}{n^{it}} - \frac{x}{\frac12 + it} + \sum_{n=1}^{\infty}\frac{x^{-2n}}{2n + \frac32 + it}.$

On the other hand, if $s = -\frac12 + it$ in (3.14) we have

$x^{1/2}\sum_{\gamma}\frac{x^{i\gamma}}{1 - i(t-\gamma)} = -x^{-\frac12 + it}\,\frac{\zeta'}{\zeta}\left(-\tfrac12 + it\right) - x^{it}\sum_{n \le x}\frac{\Lambda(n)a_n(x)}{n^{it}} + \frac{x}{\frac32 - it} + \sum_{n=1}^{\infty}\frac{x^{-2n}}{2n - \frac12 + it}.$

Subtracting the latter from the former and using

$\frac{\zeta'}{\zeta}\left(-\tfrac12 + it\right) = -\log(|t|+2) + O(1),$

which follows easily from the functional equation, we obtain the proposition. By
continuity the values $x = 1, p^m$ no longer need to be excluded. The role of RH in
Proposition 1 is notational. Recently, a new notation has emerged which is very
convenient. We write the complex zeros of the zeta function as $\rho = \frac12 + i\gamma$, $\gamma \in \mathbb{C}$,
so that $\gamma$ is complex when that zero is off the $\frac12$-line. Thus, the RH becomes the
statement that $\gamma$ is real. With this notation we see that the proof is unchanged,
and Proposition 1 holds unconditionally. We will not make any further use of this
notation since the size of the terms in our sums over zeros becomes important and
the RH is often needed.



4. Montgomery's theorem 

We first examine Montgomery's explicit formula heuristically and see what each 
term means. The weight in the sum over zeros concentrates the sum to zeros in a 
short bounded interval around t, and therefore behaves similarly to 

$\sum_{t < \gamma \le t+1} x^{i\gamma}.$

By (2.18), if this sum is substantially smaller than $\log t$ then we will have detected
cancelation from $x^{i\gamma}$. If $x = 1$ or is close to 1 no cancelation can occur, and
this is reflected by the term $x^{-1/2}\log(|t|+2)$ in (3.11). The sum over primes is

concentrated around $x$, and therefore behaves similarly to

v- A(n) 



2 If $s = 0$ we get (3.1). Landau used (3.13) to prove Riemann's original explicit formula for
$\pi(x)$.

The expected value of the original sum over primes is obtained by the PNT and
equals the remaining term

$\frac{2x^{1-it}}{\left(\frac12 + it\right)\left(\frac32 - it\right)}.$

How does one extract information from (3.11)? Montgomery was interested in

studying the distribution of the differences of pairs of zeros, and for this it is clear 
one needs to square the absolute value of the sum over zeros. It would be nice to 
be able to obtain this distribution in an interval of length one around t, but the 
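pointwise dependence on $t$ in the Dirichlet sum over primes is intractable. To
circumvent this problem, we also integrate with respect to $t$ to obtain our distribution
in a longer range. To this end we consider

$\int_0^T \left|\sum_{\gamma}\frac{x^{i\gamma}}{1 + (t-\gamma)^2}\right|^2 dt.$

Since the weight in the sum will be small when $|t - \gamma|$ is large, which is the case
over most of the integration range unless $0 < \gamma \le T$, we may restrict the sum to
this range with a small error. With the sum restricted to the zeros $0 < \gamma \le T$,
we may extend the integration range to $(-\infty, \infty)$ with a small error. Using (2.18),
Montgomery showed

$\int_0^T \left|\sum_{\gamma}\frac{x^{i\gamma}}{1 + (t-\gamma)^2}\right|^2 dt = \int_{-\infty}^{\infty}\left|\sum_{0 < \gamma \le T}\frac{x^{i\gamma}}{1 + (t-\gamma)^2}\right|^2 dt + O\left((\log T)^3\right).$

Multiplying out the integral on the right-hand side, we find

$\int_{-\infty}^{\infty}\left|\sum_{0 < \gamma \le T}\frac{x^{i\gamma}}{1 + (t-\gamma)^2}\right|^2 dt = \frac{\pi}{2}\sum_{0 < \gamma, \gamma' \le T} x^{i(\gamma - \gamma')}\, w(\gamma - \gamma'),$

where the weight

(4.1) $w(u) = \frac{4}{4 + u^2}$

is obtained on evaluating the integral either by residues, convolution, or otherwise.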
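The evaluation behind the weight $w(u) = 4/(4+u^2)$ is the identity $\int_{-\infty}^{\infty}\frac{dt}{(1+t^2)(1+(t-u)^2)} = \frac{\pi}{2}\,w(u)$, which is the convolution of two Cauchy kernels. A quick numerical check, using a plain midpoint rule with a truncation window and step size chosen by us, is:

```python
import math

def w(u):
    """The weight w(u) = 4 / (4 + u^2)."""
    return 4.0 / (4.0 + u * u)

def convolution_integral(u, half_width=200.0, step=0.01):
    """Midpoint rule for the integral over t of 1/((1+t^2)(1+(t-u)^2))."""
    n = int(2 * half_width / step)
    total = 0.0
    for k in range(n):
        t = -half_width + (k + 0.5) * step
        total += 1.0 / ((1.0 + t * t) * (1.0 + (t - u) ** 2))
    return total * step

for u in (0.0, 1.0, 3.0):
    print(u, convolution_integral(u), math.pi / 2 * w(u))
```

The integrand decays like $t^{-4}$, so truncating at $|t| = 200$ costs well under $10^{-6}$.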
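We thus define for $x > 0$

(4.2) $F(x, T) = \sum_{0 < \gamma, \gamma' \le T} x^{i(\gamma - \gamma')}\, w(\gamma - \gamma') = \frac{2}{\pi}\int_{-\infty}^{\infty}\left|\sum_{0 < \gamma \le T}\frac{x^{i\gamma}}{1 + (t-\gamma)^2}\right|^2 dt.$

Then

(4.3) $F(x, T) \ge 0, \quad F(x, T) = F(1/x, T),$

and

(4.4) $F(x, T) = \frac{2}{\pi}\int_0^T\left|\sum_{\gamma}\frac{x^{i\gamma}}{1 + (t-\gamma)^2}\right|^2 dt + O\left((\log T)^3\right).$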



The next step is to use Proposition 1 to evaluate $F(x, T)$. Denoting (3.11) by

$L(x, t) = R(x, t),$

we have just shown that

(4.5) $\int_0^T |L(x, t)|^2\, dt = 2\pi x F(x, T) + O\left(x(\log T)^3\right).$
For $R(x, t)$, we compute the mean-square of each term. For the Dirichlet series,
we use a standard mean value theorem of Montgomery and Vaughan [29], which
states that

(4.6) $\int_0^T\left|\sum_{n=1}^{\infty}\frac{a_n}{n^{it}}\right|^2 dt = \sum_{n=1}^{\infty}|a_n|^2\left(T + O(n)\right).$

Hence

$\int_0^T\left|\sum_{n=1}^{\infty}\frac{\Lambda(n)a_n(x)}{n^{it}}\right|^2 dt = \sum_{n=1}^{\infty}|\Lambda(n)a_n(x)|^2\left(T + O(n)\right) = xT\left(\log x + O(1)\right) + O\left(x^2\log x\right),$

by Stieltjes integration and the PNT (with remainder). The remaining terms are
elementary:

$\int_0^T\left|\frac{2x^{1-it}}{\left(\frac12 + it\right)\left(\frac32 - it\right)}\right|^2 dt \ll x^2,$

$\int_0^T\left|x^{-\frac12}\left(\log(|t|+2) + O(1)\right)\right|^2 dt = \frac{T}{x}\left((\log T)^2 + O(\log T)\right),$

and

$\int_0^T\left|O\left(\frac{x^{-2}}{|t|+2}\right)\right|^2 dt \ll x^{-4}.$

We thus have two main terms, the Dirichlet series term for $(\log T)^{3/2} \le x \le o(T)$
and the term $\log(|t|+2)$ which dominates for $1 \le x \le (\log T)^{3/4}$. In the intermediate
range all terms are $o(xT\log T)$. By the Cauchy-Schwarz inequality, the largest term
among these provides the main term in an asymptotic formula. Therefore,

$\int_0^T |R(x, t)|^2\, dt = xT\left(\log x + o(\log T)\right) + O\left(x^2\log x\right) + \frac{T}{x}(\log T)^2\left(1 + o(1)\right),$

and we conclude that

(4.7) $F(x, T) = \frac{T}{2\pi}\log x + o(T\log T) + O(x\log x) + \frac{T}{2\pi x^2}(\log T)^2\left(1 + o(1)\right).$

Following Montgomery, we set

(4.8) $x = T^{\alpha}$

and normalize by defining

(4.9) $F(\alpha) = F(\alpha, T) = \left(\frac{T}{2\pi}\log T\right)^{-1}\sum_{0 < \gamma, \gamma' \le T} T^{i\alpha(\gamma - \gamma')}\, w(\gamma - \gamma').$

Thus we have arrived at Montgomery's theorem. 

Theorem 1. Assume the Riemann Hypothesis. Then $F(\alpha)$ is real, even, and
nonnegative. Further, uniformly for $0 \le \alpha \le 1 - \epsilon$, we have

(4.10) $F(\alpha) = \alpha + o(1) + \left(1 + o(1)\right)T^{-2\alpha}\log T.$

The error term $O(x\log x)$ in (4.7) can be improved to $O(x)$ by using a sieve
bound for prime twins [18], which shows the theorem holds for

(4.11) $0 \le \alpha \le 1.$

A detailed analysis of the above proof has recently been done by Tsz Ho Chan,
with all second order terms obtained.
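The structural claims that $F(\alpha)$ is real, even, and nonnegative hold for any finite set of ordinates, because they only use the symmetry of the double sum and the integral representation of $F(x,T)$. The sketch below evaluates the normalized double sum over differences of zeros on the six ordinates of (2.12) with $T = 38$. This is far too few zeros to see the asymptotic shape of Theorem 1; the function name `F` and the choice of $T$ are ours.

```python
import math

# Ordinates of the first six zeros, as listed in (2.12) (truncated decimals).
GAMMAS = [14.13472, 21.02203, 25.01085, 30.42487, 32.93506, 37.58617]

def w(u):
    """The weight w(u) = 4 / (4 + u^2)."""
    return 4.0 / (4.0 + u * u)

def F(alpha, T, gammas):
    """Normalized double sum over the supplied ordinates 0 < gamma <= T."""
    zeros = [g for g in gammas if 0 < g <= T]
    log_T = math.log(T)
    total = 0.0
    for g1 in zeros:
        for g2 in zeros:
            d = g1 - g2
            # The pairs (g1, g2) and (g2, g1) have conjugate phases,
            # so only the cosine part of T^{i alpha d} survives.
            total += math.cos(alpha * d * log_T) * w(d)
    return total / (T / (2 * math.pi) * log_T)

for alpha in (0.0, 0.25, 0.5, 1.0, 1.5):
    print(alpha, F(alpha, 38.0, GAMMAS))
```

With a longer list of ordinates supplied in place of `GAMMAS`, the same function can be used to compare against the main terms in (4.10).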
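5. Application to simple zeros and small gaps between zeros

The function $F(\alpha)$ is useful for evaluating sums over differences of zeros. Let
$r(u) \in L^1$, and define the Fourier transform by

(5.1) $\hat{r}(\alpha) = \int_{-\infty}^{\infty} r(u)e(\alpha u)\, du, \quad e(u) = e^{2\pi i u}.$

If $\hat{r}(\alpha) \in L^1$, we have almost everywhere

(5.2) $r(u) = \int_{-\infty}^{\infty}\hat{r}(\alpha)e(-u\alpha)\, d\alpha.$

On multiplying (4.9) by $\hat{r}(\alpha)$ and integrating, we obtain

(5.3) $\sum_{0 < \gamma, \gamma' \le T} r\left((\gamma - \gamma')\frac{\log T}{2\pi}\right)w(\gamma - \gamma') = \left(\frac{T}{2\pi}\log T\right)\int_{-\infty}^{\infty}\hat{r}(\alpha)F(\alpha)\, d\alpha.$

Using Theorem 1, we can evaluate the right-hand side provided $\hat{r}(\alpha)$ has support
in $[-1, 1]$. Thus, we can evaluate sums over differences of zeros on the class of
functions whose Fourier transforms are supported in $[-1, 1]$. Using the Fourier pair

(5.4) $k(u) = \left(\frac{\sin\pi\lambda u}{\pi\lambda u}\right)^2, \quad \hat{k}(\alpha) = \frac{1}{\lambda}\max\left(1 - \frac{|\alpha|}{\lambda}, 0\right), \quad \lambda > 0,$

we have for $0 < \lambda \le 1$

(5.5) $\sum_{0 < \gamma, \gamma' \le T}\left(\frac{\sin\left(\frac{\lambda}{2}(\gamma - \gamma')\log T\right)}{\frac{\lambda}{2}(\gamma - \gamma')\log T}\right)^2 w(\gamma - \gamma') = \left(\frac{1}{\lambda}\int_{-\lambda}^{\lambda}\left(1 - \frac{|\alpha|}{\lambda}\right)\left(|\alpha| + T^{-2|\alpha|}\log T + o(1)\right)d\alpha\right)\frac{T}{2\pi}\log T$
$\qquad \sim \left(\frac{1}{\lambda} + \frac{\lambda}{3}\right)\frac{T}{2\pi}\log T.$

This result has an important application to simple zeros of $\zeta(s)$.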

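Theorem 2. Assume the Riemann Hypothesis. At least two thirds of the zeros of
the Riemann zeta-function are simple in the sense that as $T \to \infty$

(5.6) $N_s(T) := \sum_{\substack{0 < \gamma \le T \\ \rho \text{ simple}}} 1 \ge \left(\frac{2}{3} - o(1)\right)N(T).$

Proof. The sum in (5.5) over pairs of zeros counts distinct zeros weighted by
their multiplicity. Thus a double zero gets counted 4 times, a triple zero 9 times,
etc. Denoting the multiplicity of $\rho$ by $m_\rho$, we have

$\sum_{0 < \gamma \le T} m_\rho = \sum_{\substack{0 < \gamma, \gamma' \le T \\ \gamma = \gamma'}} 1 \le \sum_{0 < \gamma, \gamma' \le T}\left(\frac{\sin\left(\frac{\lambda}{2}(\gamma - \gamma')\log T\right)}{\frac{\lambda}{2}(\gamma - \gamma')\log T}\right)^2 w(\gamma - \gamma') \le \left(1 + o(1)\right)\left(\frac{1}{\lambda} + \frac{\lambda}{3}\right)\frac{T}{2\pi}\log T.$

Choosing $\lambda = 1$, we have

(5.7) $\sum_{0 < \gamma \le T} m_\rho \le \left(\frac{4}{3} + \epsilon\right)\frac{T}{2\pi}\log T.$

But

$\sum_{\substack{0 < \gamma \le T \\ \rho \text{ simple}}} 1 \ge \sum_{0 < \gamma \le T}(2 - m_\rho),$

and applying (2.17) completes the proof.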
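The arithmetic behind the constant in Theorem 2 is short: the coefficient $\frac{1}{\lambda} + \frac{\lambda}{3}$ from (5.5) is decreasing on $(0, 1]$, so $\lambda = 1$ is the best admissible choice and gives $\frac43$, and then $2 - \frac43 = \frac23$. A minimal check on a grid (the grid spacing is an arbitrary choice of ours):

```python
def multiplicity_bound(lam):
    """Coefficient 1/lam + lam/3 of (T/2pi) log T in (5.5), for 0 < lam <= 1."""
    return 1.0 / lam + lam / 3.0

grid = [k / 1000 for k in range(1, 1001)]      # lam ranging over (0, 1]
best_lam = min(grid, key=multiplicity_bound)
best = multiplicity_bound(best_lam)
# The count of simple zeros is at least the sum of (2 - m_rho).
simple_fraction = 2.0 - best
print(best_lam, best, simple_fraction)
```

The unconstrained minimum of $\frac{1}{\lambda} + \frac{\lambda}{3}$ sits at $\lambda = \sqrt{3}$, outside the range where Theorem 1 applies, which is why any extension of Theorem 1 beyond $|\alpha| \le 1$ improves the proportion.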

It is possible to make very small improvements in the value $\frac23$ in Theorem 2. It
would be a major advance to be able to prove that almost all the zeros are simple,
even on RH. Conrey, Ghosh, and Gonek |S] have proved using a different method
that assuming RH and the Generalized Lindelöf Hypothesis,

$N_s(T) \ge \left(\frac{19}{27} - \epsilon\right)N(T).$

Montgomery also proved that there are gaps between zeros closer than the
average. He used the transform pair (5.4) with their roles reversed to obtain

$\liminf_{n \to \infty}\,(\gamma_{n+1} - \gamma_n)\frac{\log\gamma_n}{2\pi} \le 0.669\ldots.$



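Consider the Fourier pair

(5.8) $h(u) = \left(\frac{\sin\pi u}{\pi u}\right)^2\left(\frac{1}{1 - u^2}\right), \quad \hat{h}(\alpha) = \max\left(1 - |\alpha| + \frac{\sin 2\pi|\alpha|}{2\pi},\ 0\right),$

where $h(u)$ is the Selberg minorant of the characteristic function of the interval
$[-1, 1]$ in the class of functions with Fourier transforms with support in $[-1, 1]$. We
prove

Theorem 3. Assume the Riemann Hypothesis. We have

(5.9) $\liminf_{n \to \infty}\,(\gamma_{n+1} - \gamma_n)\frac{\log\gamma_n}{2\pi} \le 0.6072\ldots.$

Proof. Take $r(u) = h\left(\frac{u}{\lambda}\right)$. Then $r(u)$ is a minorant of the characteristic function
of the interval $[-\lambda, \lambda]$. Thus

$\sum_{0 < \gamma \le T} m_\rho + 2\sum_{\substack{0 < \gamma, \gamma' \le T \\ 0 < \gamma - \gamma' \le \frac{2\pi\lambda}{\log T}}} 1 \ \ge \sum_{0 < \gamma, \gamma' \le T} r\left((\gamma - \gamma')\frac{\log T}{2\pi}\right)w(\gamma - \gamma') = \left(\frac{T}{2\pi}\log T\right)\int_{-\infty}^{\infty}\lambda\hat{h}(\lambda\alpha)F(\alpha)\, d\alpha.$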


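Assume $\lambda < 1$. Since the integrand is positive we obtain a lower bound by decreasing
the integration range to $[-1, 1]$. We can assume

$\sum_{0 < \gamma \le T} m_\rho \sim \frac{T}{2\pi}\log T,$

since otherwise we would have infinitely many multiple zeros and the theorem holds
for this reason. Thus

$\sum_{\substack{0 < \gamma, \gamma' \le T \\ 0 < \gamma - \gamma' \le \frac{2\pi\lambda}{\log T}}} 1 \ \ge \left(\frac{1}{2} - \epsilon\right)\frac{T}{2\pi}\log T\left(\lambda - 1 + 2\lambda\int_0^1\alpha\,\hat{h}(\lambda\alpha)\, d\alpha\right).$

By an easy numerical calculation, we find that the right-hand side is positive for
$\lambda > 0.6072\ldots$, which proves the result.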
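The numerical calculation referred to above can be reproduced in a few lines. The sketch below assumes the transform $\hat{h}(\alpha) = \max\left(1 - |\alpha| + \frac{\sin 2\pi|\alpha|}{2\pi}, 0\right)$ from (5.8), evaluates $\lambda - 1 + 2\lambda\int_0^1 \alpha\,\hat{h}(\lambda\alpha)\,d\alpha$ by Simpson's rule, and locates its sign change by bisection. The bracketing interval $[0.5, 0.9]$ and the function names are our own choices.

```python
import math

def h_hat(a):
    """Fourier transform of the Selberg minorant, as in (5.8)."""
    a = abs(a)
    return max(1.0 - a + math.sin(2 * math.pi * a) / (2 * math.pi), 0.0)

def bracket(lam, n=2000):
    """lam - 1 + 2*lam * integral_0^1 a*h_hat(lam*a) da, by Simpson's rule (n even)."""
    step = 1.0 / n
    total = 0.0
    for k in range(n + 1):
        a = k * step
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * a * h_hat(lam * a)
    integral = total * step / 3.0
    return lam - 1.0 + 2.0 * lam * integral

def smallest_lambda(lo=0.5, hi=0.9, tol=1e-8):
    """Bisection for the sign change of bracket() on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bracket(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

print(smallest_lambda())
```

For $\lambda < 1$ the integral has the closed form $\lambda - 1 + 2\lambda\int_0^1\alpha\hat{h}(\lambda\alpha)\,d\alpha = 2\lambda - 1 - \frac{2}{3}\lambda^2 - \frac{\cos 2\pi\lambda}{2\pi^2} + \frac{\sin 2\pi\lambda}{4\pi^3\lambda}$, which gives an independent check on the quadrature.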

By a different method (on RH) , Montgomery and Odlyzko jSD] improved on this 
result and obtained the upper bound 0.5179. Conrey, Ghosh, and Gonek [1] later 
replaced this by 0.5172. 

6. Montgomery's Conjectures 

What if $\alpha > 1$? It is not difficult to see from the proof of Montgomery's theorem
that for $x \ge T$

(6.1) $F(x, T) = \frac{1}{2\pi x}\int_0^T\left|\sum_{n=1}^{\infty}\frac{\Lambda(n)a_n(x)}{n^{it}} - \frac{2x^{1-it}}{\left(\frac12 + it\right)\left(\frac32 - it\right)}\right|^2 dt + o(T\log T).$

We saw that the diagonal terms in the sum contribute $\frac{T}{2\pi}\log x$, while the expected
value term contributes $cx$. On the other hand, we have the trivial bound

(6.2) $F(x, T) \le F(0, T) \sim \frac{T}{2\pi}\log^2 T,$

where the last relation follows from Theorem 1 (or unconditionally from (2.14)).
Thus $F(x, T)$ never gets as large as $x$ for $x \gg T(\log T)^2$, and therefore the
off-diagonal terms in the sum over primes must almost perfectly cancel the expected
value term.

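Montgomery proceeded by multiplying out the integrand in (6.1) and
integrating term by term. For the off-diagonal terms, one needs to assume the
Hardy-Littlewood $k$-tuple conjecture [20] for 2-tuples (or prime pairs) with a strong error
term. This conjecture states that for $0 < k \le N$

(6.3) $\sum_{n \le N}\Lambda(n)\Lambda(n+k) = \mathfrak{S}(k)N + O\left(N^{\frac12 + \epsilon}\right),$

where

(6.4) $\mathfrak{S}(k) = \begin{cases} 2C_2\displaystyle\prod_{\substack{p \mid k \\ p > 2}}\left(\frac{p-1}{p-2}\right), & \text{if } k \text{ is even},\ k \ne 0; \\ 0, & \text{if } k \text{ is odd}; \end{cases}$

and

(6.5) $C_2 = \prod_{p > 2}\left(1 - \frac{1}{(p-1)^2}\right).$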
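The singular series in the prime pair conjecture is easy to tabulate. The sketch below approximates the twin prime constant $C_2 = \prod_{p>2}\left(1 - \frac{1}{(p-1)^2}\right)$ by truncating the product at $10^5$, and evaluates $\mathfrak{S}(k) = 2C_2\prod_{p \mid k,\ p > 2}\frac{p-1}{p-2}$ for even $k$ (and $0$ for odd $k$) by trial division. The truncation point and the function names are our own choices.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def twin_prime_constant(limit=100_000):
    """Truncated product for C_2 over odd primes p <= limit."""
    c2 = 1.0
    for p in primes_up_to(limit):
        if p > 2:
            c2 *= 1.0 - 1.0 / (p - 1) ** 2
    return c2

def singular_series(k, c2):
    """S(k) for k != 0: 2*C_2 * prod over odd p | k of (p-1)/(p-2) if k is even, else 0."""
    if k == 0:
        raise ValueError("k must be nonzero")
    if k % 2:
        return 0.0
    value = 2.0 * c2
    k = abs(k)
    while k % 2 == 0:
        k //= 2
    p = 3
    while p * p <= k:
        if k % p == 0:
            value *= (p - 1) / (p - 2)
            while k % p == 0:
                k //= p
        p += 2
    if k > 1:                       # remaining odd prime factor
        value *= (k - 1) / (k - 2)
    return value

c2 = twin_prime_constant()
for k in (1, 2, 4, 6, 30):
    print(k, singular_series(k, c2))
```

The values fluctuate with the odd prime factors of $k$ but average to 1 over $k$, which is the mechanism by which the off-diagonal terms can cancel the expected value term.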
Montgomery stated that this conjecture "would allow us to carry out our program"
for $T \le x \le T^{2-\epsilon}$ and obtain

$F(x, T) \sim \frac{T}{2\pi}\log T.$

Further, there is no reason to expect any change in behavior for bounded $\alpha \ge 2$.
On this basis Montgomery made the following conjecture.

Strong Pair Correlation Conjecture (SPC). For any fixed bounded $M$,

(6.6) $F(\alpha) = 1 + o(1), \quad \text{for } 1 \le \alpha \le M.$

A question left unanswered by (6.6) is the rate at which the function $M = M(T)$
tends to infinity.

With regard to Montgomery's heuristics for making SPC, the argument that
(6.3) implies SPC in the range $1 \le \alpha \le 2 - \epsilon$ was carried out by Bolanz in a
1987 Diplomarbeit (in 131 pages).$^{3}$ At the cost of slightly weaker but acceptable
error terms, one can greatly simplify Bolanz's proof by smoothing (6.1) (see [16]).
In section 9, we will see that one can go further by never multiplying out the
integrand in (6.1).

With SPC and Theorem 1 we can now evaluate almost any sum over differences
of zeros. In particular, Montgomery was led to make the following now famous
conjecture.$^{4}$

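Pair Correlation Conjecture (PCC). For any fixed $\beta > 0$,

(6.7) $N(T, \beta) := \left(\frac{T}{2\pi}\log T\right)^{-1}\sum_{\substack{0 < \gamma, \gamma' \le T \\ 0 < \gamma' - \gamma \le \frac{2\pi\beta}{\log T}}} 1 \ \sim \int_0^{\beta}\left(1 - \left(\frac{\sin\pi u}{\pi u}\right)^2\right)du.$

The density here for the number of pairs of zeros within $\beta$ of the average spacing
between zeros is where the connection with random matrix theory first occurred.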
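To see what the pair correlation density $1 - \left(\frac{\sin\pi u}{\pi u}\right)^2$ predicts, one can integrate it numerically over $[0, \beta]$. The sketch below uses Simpson's rule with a step count chosen by us; for small $\beta$ the value is close to $\pi^2\beta^3/9$, which is the cubic repulsion between nearby zeros.

```python
import math

def pair_density(u):
    """1 - (sin(pi u)/(pi u))^2, with the limiting value 0 at u = 0."""
    if u == 0.0:
        return 0.0
    s = math.sin(math.pi * u) / (math.pi * u)
    return 1.0 - s * s

def predicted_pairs(beta, n=2000):
    """Simpson's rule (n even) for the integral of pair_density over [0, beta]."""
    step = beta / n
    total = 0.0
    for k in range(n + 1):
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * pair_density(k * step)
    return total * step / 3.0

for beta in (0.1, 0.5, 1.0, 2.0):
    print(beta, predicted_pairs(beta), math.pi**2 * beta**3 / 9)
```

A totally random (Poisson) arrangement of zeros would instead give the value $\beta$ itself, so the deficit at small $\beta$ measures how strongly the zeros avoid each other.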

One can now replace Theorems 2 and 3 with completely satisfactory results.
From the PCC we immediately see that the following conjecture holds.

Small Gaps Conjecture (SGC). We have

(6.8) $\liminf_{n \to \infty}\,(\gamma_{n+1} - \gamma_n)\frac{\log\gamma_n}{2\pi} = 0.$

We also have

Simple Zeros Conjecture (SZC). We have

(6.9) $N^*(T) := \left(\frac{T}{2\pi}\log T\right)^{-1}\sum_{0 < \gamma \le T} m_\rho \sim 1.$

Technically this is a conjecture on the average multiplicity which implies almost all
the zeros are simple, but there is no need to make this distinction here. Another
related conjecture that follows immediately from the PCC is that

(6.10) $N(T, \beta) = o(1), \quad \text{as } \beta \to 0^+;$

3 This thesis only proves the result in the range x < T < x5" f , but Bolanz extended the result 
to the wider range (written communication). 

4 The SPC conjecture doesn't explicitly say anything about pair correlation, and was often not 
distinguished from the PCC. It is also sometimes called Montgomery's F(a) conjecture. 
this conjecture and SZC together are sometimes referred to as the Essential
Simplicity Conjecture (ESC). Of course, the PCC itself implies a stronger repulsion
between zeros: as $\beta \to 0^+$,

(6.11) $N(T, \beta) \ll \beta^3.$

We now prove the following result.

Theorem 4. Assume the Riemann Hypothesis. SPC implies PCC and SZC. 

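First, we need a simple consequence of Theorem 1 to handle the range when
$\alpha > M$.

Lemma 1. Assume the Riemann Hypothesis. We have uniformly for any $B$,
possibly depending on $T$,

(6.12) $\int_B^{B+1} F(\alpha)\, d\alpha < 3.$

Proof. With $B = C - \frac12$, we have

$\int_{C - \frac12}^{C + \frac12} F(\alpha)\, d\alpha \le 2\int_{C-1}^{C+1}\left(1 - |\alpha - C|\right)F(\alpha)\, d\alpha \le \frac{2}{\frac{T}{2\pi}\log T}\sum_{0 < \gamma, \gamma' \le T}\left(\frac{\sin\left(\frac12(\gamma - \gamma')\log T\right)}{\frac12(\gamma - \gamma')\log T}\right)^2 w(\gamma - \gamma') \le \frac{8}{3} + o(1)$

by (5.5).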

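Proof of Theorem 4. For SZC, we repeat the calculation in (5.5) but now assume
$\lambda \ge 1$ and use SPC for that range to find

(6.13) $N^*(T) \le 1 + \frac{1}{3\lambda^2} + o(1).$

The result now follows on letting $\lambda \to \infty$.

To prove the PCC, we use the Fejér kernel from (5.4) and apply (5.3) to get

(6.14) $\beta\int_{-\infty}^{\infty} F(\alpha)\left(\frac{\sin\pi\beta\alpha}{\pi\beta\alpha}\right)^2 d\alpha = \left(\frac{T}{2\pi}\log T\right)^{-1}\sum_{\substack{0 < \gamma, \gamma' \le T \\ |\gamma' - \gamma| \le \frac{2\pi\beta}{\log T}}}\left(1 - \frac{|\gamma - \gamma'|\log T}{2\pi\beta}\right)w(\gamma - \gamma') = N^*(T) + \frac{2}{\beta}\int_0^{\beta} N(T, u)\, du + o(1),$

where the error term comes from removing the factor $w(\gamma - \gamma')$. By SZC, $N^*(T) \sim 1$.
We now evaluate the left-hand side using Theorem 1 in the range $|\alpha| \le 1$, SPC in
$1 \le |\alpha| \le M$, and Lemma 1 in $|\alpha| > M$. On letting $M \to \infty$, we have

(6.15) $\int_0^{\beta} N(T, u)\, du \sim \frac{\beta^2}{2} + \int_0^{\beta}(u - \beta)\left(\frac{\sin\pi u}{\pi u}\right)^2 du.$

Since

$\frac{1}{h}\int_{\beta - h}^{\beta} N(T, u)\, du \le N(T, \beta) \le \frac{1}{h}\int_{\beta}^{\beta + h} N(T, u)\, du,$

we obtain the PCC on differencing (6.15).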

7. Gallagher and Mueller's Work on Pair Correlation 

A few years after Montgomery's work, Gallagher and Mueller [12] proved a
number of interesting results on pair correlation. Their starting point is the counting
function $N(T, \beta)$ in (6.7), but rather than assuming it satisfies the PCC they
assumed

(7.1) $N(T, \beta) \sim \int_0^{\beta}\left(1 - \mu(a)\right)da,$

uniformly for $0 < \beta_0 \le \beta \le \beta_1 < \infty$, as $T \to \infty$, where $\mu$ is a real, even, continuous,
$L^1$ function. Thus they assumed an asymptotic density function for pair correlation,
where $\mu(a)$ measures the deviation from a uniform distribution corresponding to a
totally random distribution of zeros. They then proved the following result.

Theorem 5. With $N^*(T)$ given in (6.9), we have

$\int_{-\infty}^{\infty}\mu(a)\, da \sim N^*(T).$

In particular, PCC implies SZC.

From this we see that

$\int_{-\infty}^{\infty}\mu(a)\, da \ge 1,$

which shows that if the zeros of the zeta-function have an asymptotic pair
correlation density, then the zeros must repulse each other somewhat. Further evidence
of this was later obtained by Gallagher [11].

That the PCC implies SZC follows from

$\int_{-\infty}^{\infty}\left(\frac{\sin\pi a}{\pi a}\right)^2 da = 1.$

The notable feature here is that this result holds unconditionally. One can obtain
this result on RH by first using Theorem 1 to prove (5.5), and then evaluating
the off-diagonal terms in the sum over zeros by partial summation with $N(T, \beta)$ to
determine the diagonal terms $N^*(T)$. Gallagher and Mueller replaced Theorem 1
by a result of Fujii [5j (also obtained by Selberg) on $S(T)$. Let

(7.3) $R(T, h) = \int_0^T\left(S(t + h) - S(t)\right)^2 dt.$

Fujii proved that

(7.4) $R(T, h) \ll T\log(2 + h\log T) \quad \text{if } \frac{1}{\log T} \le h \ll 1,$

and

(7.5) $R(T, h) \sim \frac{T}{\pi^2}\log(h\log T) \quad \text{if } h\log T \to \infty,\ h \ll 1.$



Proof of Theorem 5. We have 
(N(t + h) - N(t)) 2 dt -- 





h log T 

(This is (6.14) from a different perspective.) By (2.14), the left-hand side is also

On substituting $N(T, u)$ from (7.1) and letting $h\log T \to \infty$ and $h \to 0$ so that
(7.5) applies, the theorem follows.

Gallagher and Mueller proved that for h = -^rp 

/oo 
min(H, (3)n(a) da, 
-oo 

a result essentially equivalent to PCC. 

Gallagher and Mueller also studied some consequences of (7.1) for primes. In
particular they proved that the error in the PNT can be improved on assuming
(7.1) and RH to

(7.7) $\psi(x) = x + o\left(x^{1/2}(\log x)^2\right),$

and obtained an asymptotic formula for a weighted second moment for primes
in short intervals first studied by Selberg. Their proof is quite complicated,

since the approach in using (7.1) requires partial summation to evaluate sums over
differences of zeros, introducing many complications to handle the "edges" of the
summation. An interesting consequence is a form of Theorem 1 obtained for $\hat\mu(\alpha)$.
Assuming RH and also SZC, Gallagher and Mueller proved

(7.8) $\hat\mu(\alpha) = 1 - |\alpha|, \quad |\alpha| \le 1.$

The PCC density agrees with this, and has $\hat\mu(\alpha) = 0$ elsewhere.

Related to this, there is an alternative form of the PCC which has been found
useful when generalizing to higher correlations. Starting from (5.3), and supposing
$\hat{r}(\alpha)$ has support in $[-1, 1]$, we have by Theorem 1 on RH that

(7.9) $\int_{-\infty}^{\infty}\hat{r}(\alpha)F(\alpha)\, d\alpha = r(0) + \int_{-\infty}^{\infty}\hat{r}(\alpha)\left(F(\alpha) - 1\right)d\alpha$
$\qquad \sim r(0) + \hat{r}(0) - \int_{-1}^{1}\left(1 - |\alpha|\right)\hat{r}(\alpha)\, d\alpha$
$\qquad = \hat{r}(0) + \int_{-\infty}^{\infty} r(u)\left(1 - \left(\frac{\sin\pi u}{\pi u}\right)^2\right)du$

by Plancherel's formula. The second line is still true if $F(\alpha) \sim 1$ for $|\alpha| > 1$ even
if $\hat{r}(\alpha)$ does not have support in $[-1, 1]$. It is no accident that the PCC density
occurs in the integrand. The conclusion is that assuming RH, Theorem 1 implies
that

(7.10) $\left(\frac{T}{2\pi}\log T\right)^{-1}\sum_{0 < \gamma, \gamma' \le T} r\left((\gamma - \gamma')\frac{\log T}{2\pi}\right)w(\gamma - \gamma') \sim \hat{r}(0) + \int_{-\infty}^{\infty} r(u)\left(1 - \left(\frac{\sin\pi u}{\pi u}\right)^2\right)du,$

for all $r$ with $\hat{r}$ having support in $[-1, 1]$. Moreover, PCC is equivalent to the
conjecture that (7.10) holds for all test functions $r$ in some dense subset of $L^1$.
Here, the factor $w(\gamma - \gamma')$ may be removed, if desired.

8. Heath-Brown's Results on Primes 

In |21| Heath-Brown proved a number of results on primes using Montgomery's 
F(a) function. By l|tj.2fl . the trivial bound for F(a) is 

(8.1) F(a) < (l + o(l)) logT. 

Heath-Brown showed that any improvement in the order of magnitude of this bound
would have important implications for primes. First, he proved that the improvement
in the error in the PNT (7.7) also holds if one assumes RH and $F(\alpha) = o(\log T)$
uniformly for $1 \le \alpha \le M$, for any bounded $M$. It should be pointed out that further
improvements in the error depend not only on the size of $F(\alpha)$ but also on the
growth of $M(T)$.

Heath-Brown next proved a number of results on gaps between primes, which
take their strongest form if we assume

(8.2) $F(\alpha) \ll 1,$

for various ranges of $\alpha$. With regard to Cramér's bound (3.9), he proved that,
assuming RH and (8.2) for $\alpha$ in any small interval around $\alpha = 2$,

(8.3) $p_{n+1} - p_n \ll \sqrt{p_n \log p_n}.$

Assuming $F(\alpha) \sim 1$ in this range one can improve (8.3) on RH to little oh [22].^5
(This also follows from a result in the next section.) Next, assuming (8.2) for
$1 \le \alpha \le 2 + \epsilon$ and RH,

(8.4) $\sum_{\substack{p_n \le x \\ p_{n+1} - p_n > H}} (p_{n+1} - p_n) \ll \frac{x \log x}{H}.$

This becomes non-trivial as soon as $H / \log x \to \infty$. On integrating with respect to $H$,
we obtain

(8.5) $\sum_{p_n \le x} (p_{n+1} - p_n)^2 \ll x (\log x)^2.$

Previously Selberg [34], improving on earlier work of Cramér, obtained these
results on RH alone with an extra $\log x$ in each bound. Finally, Heath-Brown
proved on RH and $F(\alpha) \sim 1$ in any interval around $\alpha = 1$ that

(8.6) $\liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = 0,$

so that there exist small gaps much smaller than the average gap between primes.
This result can be made to depend on the size of the error term in the asymptotic
formula for $F(\alpha)$ for $\alpha$ in a neighborhood of 1. If the error term is a logarithm
smaller than the main term, then one actually gets that there are infinitely often
primes a bounded distance apart.

^5 The unusual order of the listed authors was due to a typo in the manuscript.
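The quantities in (8.4), (8.5) and (8.6) are easy to tabulate for small $x$. The sketch below computes the sum of large gaps, the second moment of gaps, and the smallest normalized gap for $p_n \le x$. The function names, and the choice to sieve up to $2x$ (enough, by Bertrand's postulate, to find the prime following $x$), are illustrative assumptions. It is meant only to make the statements concrete, not as evidence for them.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # clear every multiple of p starting from p*p
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def gap_statistics(x, big_gap):
    """Return (sum of gaps > big_gap, sum of squared gaps, min gap / log p) for p_n <= x."""
    primes = primes_up_to(2 * x)
    large_gap_sum = 0
    second_moment = 0
    smallest_ratio = float("inf")
    for p, q in zip(primes, primes[1:]):
        if p > x:
            break
        gap = q - p
        second_moment += gap * gap
        if gap > big_gap:
            large_gap_sum += gap
        smallest_ratio = min(smallest_ratio, gap / math.log(p))
    return large_gap_sum, second_moment, smallest_ratio

x = 10**5
H = int(math.log(x) ** 2)
large, second, ratio = gap_statistics(x, H)
print(large, x * math.log(x) / H)      # left side and scale of (8.4)
print(second, x * math.log(x) ** 2)    # left side and scale of (8.5)
print(ratio)                           # smallest (p_{n+1} - p_n) / log p_n so far
```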

In the next section, we shall prove these results by following a method that is
structurally different but fundamentally the same as Heath-Brown's arguments. A
very useful idea of Heath-Brown is the following bound for the sum over zeros (3.6).

Theorem 6. For T > 2. 

(8.7)

Note that this becomes non-trivial as soon as our bound for F is non-trivial. 
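Proof. We have

(8.8) $F(x,T) = \int_{-\infty}^{\infty} \left| \sum_{0 < \gamma \le T} x^{i\gamma} e^{i\gamma u} \right|^2 e^{-2|u|}\, du.$

This is related to (4.2) by Plancherel's theorem but may be verified directly. By
Gallagher's inequality [5],

$|f(0)| \ll \int_{-1}^{1} |f(u)|\, du + \int_{-1}^{1} |f'(u)|\, du$

for $f \in C^1$. Then with

$f(u) = \left( \sum_{0 < \gamma \le T} x^{i\gamma} e^{i\gamma u} \right)^2$

we obtain

$\left| \sum_{0 < \gamma \le T} x^{i\gamma} \right|^2 \ll \int_{-1}^{1} \left| \sum_{0 < \gamma \le T} x^{i\gamma} e^{i\gamma u} \right|^2 du + \int_{-1}^{1} \left| \sum_{0 < \gamma \le T} x^{i\gamma} e^{i\gamma u} \cdot \frac{\partial}{\partial u} \sum_{0 < \gamma \le T} x^{i\gamma} e^{i\gamma u} \right| du.$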



In the first integral on the right we insert the weight $e^{-2|u|}$ and extend the limits
of integration to $(-\infty, \infty)$ to see that, by (8.8), this is bounded by $F(x,T)$. To
complete the proof, the second integral is handled similarly following an application
of the Cauchy-Schwarz inequality and partial summation.
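The reason the weight $e^{-2|u|}$ appears is a one-line Fourier computation. As a sketch, assuming Montgomery's weight is $w(v) = 4/(4 + v^2)$ and $F(x,T) = \sum_{0 < \gamma, \gamma' \le T} x^{i(\gamma - \gamma')} w(\gamma - \gamma')$ (both taken from the standard definitions rather than restated in this section):

```latex
\int_{-\infty}^{\infty} e^{iuv} e^{-2|u|}\,du
  = 2\int_0^{\infty} \cos(uv)\, e^{-2u}\,du
  = 2\cdot\frac{2}{4+v^2} = \frac{4}{4+v^2} = w(v),
\qquad\text{so}\qquad
\int_{-\infty}^{\infty} \Bigl|\sum_{0<\gamma\le T} x^{i\gamma}e^{i\gamma u}\Bigr|^2 e^{-2|u|}\,du
  = \sum_{0<\gamma,\gamma'\le T} x^{i(\gamma-\gamma')}\, w(\gamma-\gamma') = F(x,T).
```

On a bounded interval $|u| \le c$ the weight is at least $e^{-2c}$, so an integral of the non-negative integrand over that interval is at most $e^{2c} F(x,T)$; this is all that "inserting the weight" uses.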

9. Equivalence between SPC and Primes 

In [18] Montgomery and I proved the following equivalence between the SPC and
the second moment for primes in short intervals.

Theorem 7. Assume the Riemann Hypothesis. If $0 < B_1 \le B_2 \le 1$, then

(9.1) $I(X,\delta) = \int_1^X \left( \psi((1+\delta)x) - \psi(x) - \delta x \right)^2 dx \sim \frac{1}{2}\, \delta X^2 \log\frac{1}{\delta}$

holds uniformly for $X^{-B_2} \le \delta \le X^{-B_1}$ provided

(9.2) $F(X,T) \sim \frac{T}{2\pi} \log T$

holds for

$X^{B_1} (\log X)^{-3} \le T \le X^{B_2} (\log X)^{3}.$

Conversely, if $1 \le A_1 \le A_2 < \infty$, then (9.2) holds uniformly for $T^{A_1} \le X \le T^{A_2}$
provided that (9.1) holds uniformly for

$X^{-1/A_1} (\log X)^{-3} \le \delta \le X^{-1/A_2} (\log X)^{3}.$

In particular, one can prove on RH that SPC is equivalent to

(9.3) $\int_1^X \left( \psi(x+h) - \psi(x) - h \right)^2 dx \sim h X \log\frac{X}{h}$

for $1 \le h \le X^{1-\epsilon}$,^6 where an argument of Saffari and Vaughan [33] is used to move
from primes in the interval $(x, x + \delta x]$ to the fixed interval $(x, x + h]$. Results (8.3)
- (8.6) are consequences of Theorem 7. Further, it is a straightforward exercise to
show that the twin prime conjecture in the form (6.3) implies (9.1) and (9.3) in the
ranges $X^{-1} \le \delta \le X^{-1/2-\epsilon}$ and $1 \le h \le X^{1/2-\epsilon}$ respectively, and consequently we
again obtain that (6.3) implies SPC in the range $1 \le \alpha \le 2 - \epsilon$. Of course, the
second moment for primes in short intervals (9.1) or (9.3) is a considerably weaker
hypothesis and gives the full range for SPC.
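For integer $h$ the integrand in (9.3) is a step function of $x$, so the integral over $[1, X]$ reduces to a finite sum that is easy to compute. A small numerical sketch follows; the lower limit 1, the helper names, and the sample values of $X$ and $h$ are assumptions made here for illustration, and the ranges are far too small to test the asymptotic.

```python
import math

def chebyshev_psi(limit):
    """Return a list psi with psi[n] = sum of Lambda(m) over m <= n."""
    lam = [0.0] * (limit + 1)
    composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for m in range(p * p, limit + 1, p):
            composite[m] = True
        power = p
        while power <= limit:
            lam[power] = math.log(p)  # Lambda(p^j) = log p
            power *= p
    psi = [0.0] * (limit + 1)
    for n in range(1, limit + 1):
        psi[n] = psi[n - 1] + lam[n]
    return psi

def short_interval_second_moment(X, h):
    """Integral over [1, X] of (psi(x+h) - psi(x) - h)^2 for integers X and h.

    For integer h, psi(x+h) - psi(x) is constant on each [m, m+1), so the
    integral equals the sum over m = 1, ..., X-1 of the squared value at m.
    """
    psi = chebyshev_psi(X + h)
    return sum((psi[m + h] - psi[m] - h) ** 2 for m in range(1, X))

X = 10**5
for h in (10, 100, 1000):
    moment = short_interval_second_moment(X, h)
    # columns: h, the second moment, and its ratio to h X log(X/h)
    print(h, moment, moment / (h * X * math.log(X / h)))
```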

Proof of Theorem 7. We follow initially the analysis in [23]. Let us consider
again Montgomery's explicit formula (3.11) but now aim towards obtaining a sum
over primes in a short interval. This is usually done by differencing values of $x$ but
Montgomery showed me the following elegant approach. Let $\kappa$, $\delta$, and $T$ be related by



(9.4) 

so that 6 = i, and define 



(9.5) 



G K (t) = 




^ = 1 + 5 = 1 + - 



A(n)a n (x) 

n=l 



2X 1 



n 



it 



The Fourier transform of $G_\kappa(t)$ is

K 



G K {y) 



\y+ 



E 

Mil 



A(n)a„(a;) 



27TX 



it) 



K/1 



K/2 



it) 



dv 

v 



which has the desired (but weighted) sum over primes in a short interval. By
Parseval's identity, we have



(9.6) 



G K {t) 



dt 



G K (y) 



dy. 



Using (3.11) to express $G_\kappa(t)$ in terms of a sum over zeros with the remaining terms
estimated as error terms and simplifying we find, assuming RH,

(9.7) 



^ A(n)a n (x) 

y<n<y+it, 

Axk 2 



av(y) 



dy 

y 



sin 7|i 

Hf 



.17 



E 1 + (i _ 7 )2 



dt + O 



T 



^6 If (9.3) holds for this range of $h$ it implies RH.

We abbreviate equation (9.7) as

(9.8) L(x,T) = R(x,T). 

To prove Heath-Brown's results (8.3) and (8.4) from the last section, it is easy
to see that, taking T =

L(x,T)>^ (Pn+1-Pn) 

f<Pn<X 
Pn + l-Pn>H 

and, assuming (8.2),

i?(x,T)« -logT. 

Equation (8.4) follows from this, and (8.3) follows by taking only the last term in
llOl . 

For the proof of Theorem 7 we would like to remove the weight $a_n(x)$ in $L(x,T)$
and thus obtain an expression involving $I(X,\delta)$. In view of (4.4),
$R(x,T)$ can be related to $F(x,T)$ through Abelian and Tauberian theorems. If one
assumes an asymptotic formula for $I(X,\delta)$ one then obtains an asymptotic formula
for $L(x,\delta)$ which gives an asymptotic formula for $R(x,\delta)$ and then a Tauberian
theorem gives an asymptotic formula for $F(x,T)$. The converse direction works
similarly using an Abelian theorem. All the details may be found in [18] except
how $L(x,T)$ is related to $I(X,\delta)$, since the proof there proceeds from (2.14) rather
than Proposition 1. It took me a long time to figure out how to remove the weight
a n (y) even though it is actually obvious. If y is small, then for y < n < y + ^ it 
is reasonable to replace $n$ by $y$ and thus replace $a_n(x)$ with $a_y(x)$ in (9.7) with a
small error. Thus the weight is removed, and one finds that

(9.9) ^, r) = toS p (9 ,^ + (iK), 

Since the integrand is non-negative, if we have an asymptotic formula for $L(x,T)$
then a simple differencing argument will give an asymptotic formula for $I(x,\delta)$. The
converse is immediate. Here the error term is smaller than the main term when
$T \le x \le T^{2-\epsilon}$. To obtain the full range, rather than replacing $a_n(x)$ by $a_y(x)$,
we use Stieltjes integration and the PNT with the RH error (3.4) to evaluate the
sum over primes, and together with the Cauchy-Schwarz inequality we find that
the error term in (9.9) can be replaced by

p(* (1 °y )4 )- 

This suffices for the full range. 

10. Selberg's theory of $S(T)$

For more than 50 years, Selberg has been working on the distribution of values of
$\log\zeta(s)$ and related functions. In the early 1940's he made major contributions
on $S(T)$. Further results for Dirichlet $L$-functions were obtained in [37].
Selberg has continued to work on these problems, and while he has lectured on his
results, his next published paper on this subject only appeared in 1992. In
this already famous paper Selberg introduced the properties of a general class of
Dirichlet series, now referred to as the "Selberg class". Selberg showed that his
theory, originally devised for the Riemann zeta-function, carries over to the Selberg
class with remarkably few changes. To learn more about this subject, I recommend
first reading Selberg's 1992 paper. Second, Kai-man Tsang (Selberg's only Ph.D.
student) wrote a thesis in 1984 which contains full details of the proofs for
some of Selberg's more recent work on $\log\zeta(s)$. Also, the two papers of A. Ghosh
[13], [14] refine some of Selberg's work from the 1940's.

As examples, we state two of Selberg's results proved in Tsang's thesis. Selberg
has developed methods for evaluating

$\int_0^T F(\log\zeta(\sigma + it))\, dt,$

for functions $F(z)$ such as $\mathrm{sgn}(\mathrm{Re}(z))$, $\mathrm{sgn}(\mathrm{Im}(z))$, $|\mathrm{Re}(z)|$, and $|\mathrm{Im}(z)|$. Let $\chi_{\alpha,\beta}(u)$
be 1 if $\alpha < u < \beta$ and zero otherwise. Then for $\alpha < \beta$

We have similar results for the real and imaginary parts of $\log\zeta(\sigma + it)$.

For the second result, let $Z(T)$ denote the number of sign changes of $S(t)$ in $[0,T]$.
Selberg proved $Z(T) > T(\log T)^{1/3-\epsilon}$ on RH in [35], and unconditionally (and with
an improvement on the $\epsilon$) in [36]. Ghosh improved this to $Z(T) \gg T(\log T)^{1-\epsilon}$.^7
Tsang's thesis contains the following remarkable improvements on these results. For
some $c > 0$,

(10.2) $Z(T) \gg T \log T\, e^{-c(\log\log\log T)^2}$
and

(10.3) $Z(T) \ll T \log T\, \frac{\log\log\log T}{\sqrt{\log\log T}}.$

If the analysis of an error term could be improved then one would obtain

(10.4) $Z(T) \sim \frac{T \log T}{\sqrt{\pi \log\log T}}.$

I will now describe some key ideas that went into Selberg's work on $S(t)$. The
very remarkable result that Selberg proved in 1946 is that all the even moments of
$S(t)$ can be computed unconditionally [36]. He proved this on intervals $(T, T+H]$,
where $T^a < H \le T$ and $a > \frac{1}{2}$, but for simplicity we will consider the interval
$[0,T]$.



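Theorem 8. For $k \ge 1$, we have

(10.5) $\int_0^T \left| S(t) + \frac{1}{\pi} \sum_{p \le T^{1/k}} \frac{\sin(t \log p)}{\sqrt{p}} \right|^{2k} dt \ll_k T$

and

(10.6) $\int_0^T |S(t)|^{2k}\, dt = \frac{(2k)!}{k!\,(2\pi)^{2k}}\, T (\log\log T)^k + O_k\left( T (\log\log T)^{k - \frac{1}{2}} \right).$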



This last relation is the $2k$th moment of a Gaussian. Earlier Selberg [35] proved
(10.5) assuming the RH, and also (10.6) on RH but with an error term $O_k(T(\log\log T)^{k-1})$.
These results were a great advance over previous work, which had failed to even
obtain an asymptotic formula for the second moment. From (10.5) we see that $S(t)$
can be approximated well in $L^{2k}$ norm by the imaginary part of a short Dirichlet
series. This series is short enough so that its $L^{2k}$ norm is determined by diagonal
terms, and has the Gaussian property in (10.6). Thus $S(t)$ has this property too.

^7 Also obtained earlier but unpublished by Selberg.
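To see why (10.6) is a Gaussian moment, compare with a centered normal variable. As a sketch, assuming the leading constant in (10.6) is Selberg's constant $(2k)!/(k!\,(2\pi)^{2k})$, let $X$ be Gaussian with mean 0 and variance $\sigma^2 = \frac{1}{2\pi^2} \log\log T$, which is what the case $k = 1$ dictates. The symbol $X$ here is introduced only for this comparison.

```latex
\mathbb{E}\,X^{2k} = \frac{(2k)!}{2^k\,k!}\,\sigma^{2k}
 = \frac{(2k)!}{2^k\,k!}\cdot\frac{(\log\log T)^k}{(2\pi^2)^k}
 = \frac{(2k)!}{k!\,(2\pi)^{2k}}\,(\log\log T)^k .
```

So the main term of (10.6), divided by $T$, matches the $2k$th moment of this Gaussian for every $k$.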

The proof of (10.5) and (10.6) is based on an approximate formula for $S(t)$, which
has its origin in Selberg's earlier paper [34] on primes in short intervals. There, he
proved on RH that for a = \ + j-f™ and ^> 1, 



(10.7) 



2 1 logT 
T 2 



dt<^p T(logT) 



2 



Selberg's work was ahead of its time, since we now know that replacing the bound
in (10.7) by an asymptotic formula is equivalent to the PCC [17].

Selberg first found an approximate formula for $\frac{\zeta'}{\zeta}(s)$. This is not straightforward.
For $\sigma > 1$, we have the Dirichlet series representation (2.2) for $\frac{\zeta'}{\zeta}(s)$. As we bring $s$
into the critical strip the Dirichlet series fails to converge. It is a familiar fact that an
appropriate partial sum of a Dirichlet series will still provide a good approximation
for the analytic continuation of the series. However, on or near the critical line we
expect the poles from the zeros of $\zeta(s)$ to dominate, as reflected in the partial-fraction
formula, for $s \ne \rho$, $t \ge 2$,

(10.8) $\frac{\zeta'}{\zeta}(s) = \sum_{\rho} \left( \frac{1}{s - \rho} + \frac{1}{\rho} \right) + O(\log t).$
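Since

(10.9) $S(t) = \frac{1}{\pi} \arg\zeta\left( \frac{1}{2} + it \right) = -\frac{1}{\pi} \int_{1/2}^{\infty} \mathrm{Im}\left( \frac{\zeta'}{\zeta}(\sigma + it) \right) d\sigma,$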

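it is (maybe) plausible that the Dirichlet series part of $S(t)$ will usually dominate. A
candidate for an approximate formula is (3.13) which we can rewrite as, for $x > 1$,
$s \ne 1$, $s \ne \rho$, $s \ne -2k$,

(10.10) $\frac{\zeta'}{\zeta}(s) = -\sum_{n \le x} \frac{\Lambda(n)}{n^s} + \frac{x^{1-s}}{1-s} - \sum_{\rho} \frac{x^{\rho-s}}{\rho-s} + \sum_{n=1}^{\infty} \frac{x^{-2n-s}}{2n+s}.$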

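In hindsight (10.10) looks even better, because

$-\frac{1}{\pi} \int_{1/2}^{\infty} \mathrm{Im}\left( -\sum_{n \le x} \frac{\Lambda(n)}{n^{\sigma+it}} \right) d\sigma = -\frac{1}{\pi} \sum_{n \le x} \frac{\Lambda(n) \sin(t \log n)}{n^{1/2} \log n},$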

which gives exactly the approximation in (10.5) from the terms where $n$ is prime.
(The prime powers will contribute an error term.) The problem here is that the
sum over zeros does not converge absolutely, and consequently (10.10) has never
been used successfully for this problem. Earlier work had smoothed this formula
(or rather over-smoothed it), so that the correct approximation was lost. Selberg
had the innovative idea that one only needs to smooth slightly in order to obtain
absolute convergence in the sum over zeros.
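The termwise integration of the Dirichlet polynomial in (10.10) against (10.9) is elementary. For each $n \ge 2$, writing $n^{-\sigma - it} = n^{-\sigma} (\cos(t \log n) - i \sin(t \log n))$, a short worked computation (a sketch of the step, not taken verbatim from the source) gives:

```latex
\operatorname{Im}\bigl(-\Lambda(n)\,n^{-\sigma-it}\bigr) = \Lambda(n)\,n^{-\sigma}\sin(t\log n),
\qquad
\int_{1/2}^{\infty} n^{-\sigma}\,d\sigma = \frac{n^{-1/2}}{\log n},
\qquad\text{hence}\qquad
-\frac{1}{\pi}\int_{1/2}^{\infty}\operatorname{Im}\bigl(-\Lambda(n)\,n^{-\sigma-it}\bigr)\,d\sigma
 = -\frac{1}{\pi}\,\frac{\Lambda(n)\sin(t\log n)}{n^{1/2}\log n}.
```

For $n = p$ prime, $\Lambda(p)/\log p = 1$ and the term is $-\frac{1}{\pi} \sin(t \log p)/\sqrt{p}$, the summand in (10.5); for $n = p^j$ the factor is $\Lambda(p^j)/\log p^j = 1/j$.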
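Let

(10.11) $\Lambda_x(n) = \begin{cases} \Lambda(n), & \text{for } 1 \le n \le x, \\ \Lambda(n)\, \dfrac{\log(x^2/n)}{\log x}, & \text{for } x \le n \le x^2. \end{cases}$

Then, for $x > 1$, $s \ne 1$, $s \ne \rho$, $s \ne -2k$,

(10.12) $\frac{\zeta'}{\zeta}(s) = -\sum_{n \le x^2} \frac{\Lambda_x(n)}{n^s} + \frac{x^{2(1-s)} - x^{1-s}}{(1-s)^2 \log x} + \frac{1}{\log x} \sum_{\rho} \frac{x^{\rho-s} - x^{2(\rho-s)}}{(\rho-s)^2} + \frac{1}{\log x} \sum_{n=1}^{\infty} \frac{x^{-2n-s} - x^{-2(2n+s)}}{(2n+s)^2}.$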

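This formula is much easier to prove than (10.10). Selberg next argues as follows.
Assume RH, and suppose $4 \le x \le t^2$. Let

(10.13) $\sigma_1 = \frac{1}{2} + \frac{1}{\log x},$

which is at the transition from the region where the Dirichlet series dominates to
the region where the zeros dominate. From (10.12), we see that for $\sigma \ge \sigma_1$ and
some complex number $\omega$ with $|\omega| \le 1$

(10.14) $\frac{\zeta'}{\zeta}(\sigma + it) = -\sum_{n \le x^2} \frac{\Lambda_x(n)}{n^{\sigma+it}} + 2\omega\, x^{\frac{1}{2}-\sigma} \sum_{\gamma} \frac{\sigma_1 - \frac{1}{2}}{(\sigma_1 - \frac{1}{2})^2 + (t - \gamma)^2} + O\left( x^{\frac{1}{2}-\sigma} \right).$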

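We also have, on taking the real part of (10.8),

$\mathrm{Re}\, \frac{\zeta'}{\zeta}(\sigma + it) = \sum_{\gamma} \frac{\sigma - \frac{1}{2}}{(\sigma - \frac{1}{2})^2 + (t - \gamma)^2} + O(\log t).$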

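Thus, taking the real part of (10.14) with $\sigma = \sigma_1$ gives for some $-1 \le \omega' \le 1$,

$\sum_{\gamma} \frac{\sigma_1 - \frac{1}{2}}{(\sigma_1 - \frac{1}{2})^2 + (t - \gamma)^2} + O(\log t) = -\mathrm{Re} \sum_{n \le x^2} \frac{\Lambda_x(n)}{n^{\sigma_1+it}} + \frac{2\omega'}{e} \sum_{\gamma} \frac{\sigma_1 - \frac{1}{2}}{(\sigma_1 - \frac{1}{2})^2 + (t - \gamma)^2}.$

Since $1 - \frac{2\omega'}{e} \ge 1 - \frac{2}{e} > \frac{1}{4}$, we conclude that

$\sum_{\gamma} \frac{\sigma_1 - \frac{1}{2}}{(\sigma_1 - \frac{1}{2})^2 + (t - \gamma)^2} = O\left( \left| \sum_{n \le x^2} \frac{\Lambda_x(n)}{n^{\sigma_1+it}} \right| \right) + O(\log t).$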



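Substituting this back into (10.14), we obtain

(10.15) $\frac{\zeta'}{\zeta}(\sigma + it) = -\sum_{n \le x^2} \frac{\Lambda_x(n)}{n^{\sigma+it}} + O\left( x^{\frac{1}{2}-\sigma} \left| \sum_{n \le x^2} \frac{\Lambda_x(n)}{n^{\sigma_1+it}} \right| \right) + O\left( x^{\frac{1}{2}-\sigma} \log t \right).$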



Selberg next substitutes (10.15) into (10.9) for the integration range $\sigma_1 \le \sigma < \infty$.
For $\frac{1}{2} \le \sigma \le \sigma_1$, he uses (10.8) and (10.15) to show this range only contributes to
the error terms. The conclusion is the following theorem, which is the primary tool
for obtaining Theorem 8 assuming the RH.

Theorem 9. Assume the Riemann Hypothesis. For $t \ge 2$, $4 \le x \le t^2$, and $\sigma_1$
given in (10.13), we have

(10.16) $S(t) = -\frac{1}{\pi} \sum_{n \le x^2} \frac{\Lambda_x(n) \sin(t \log n)}{n^{\sigma_1} \log n} + O\left( \frac{1}{\log x} \left| \sum_{n \le x^2} \frac{\Lambda_x(n)}{n^{\sigma_1+it}} \right| \right) + O\left( \frac{\log t}{\log x} \right).$
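The smoothed weights $\Lambda_x(n)$ of (10.11) and the Dirichlet polynomial in Theorem 9 are simple to evaluate. The sketch below assumes the main term of (10.16) has the form $-\frac{1}{\pi} \sum_{n < x^2} \Lambda_x(n) \sin(t \log n) / (n^{\sigma_1} \log n)$ with $\sigma_1 = \frac{1}{2} + 1/\log x$; the function names are hypothetical, and the code computes only this main term, not $S(t)$ itself.

```python
import math

def von_mangoldt(n):
    """Lambda(n): log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            # p is the smallest prime factor; n is a prime power iff nothing else remains
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # n itself is prime

def selberg_lambda(n, x):
    """Lambda_x(n): Lambda(n) up to x, then damped linearly in log n up to x^2."""
    if 1 <= n <= x:
        return von_mangoldt(n)
    if x < n <= x * x:
        return von_mangoldt(n) * math.log(x * x / n) / math.log(x)
    return 0.0

def s_main_term(t, x):
    """Assumed main term of (10.16): a short sum over prime powers n < x^2."""
    sigma1 = 0.5 + 1.0 / math.log(x)
    total = 0.0
    n = 2
    while n < x * x:
        weight = selberg_lambda(n, x)
        if weight:
            total += weight * math.sin(t * math.log(n)) / (n ** sigma1 * math.log(n))
        n += 1
    return -total / math.pi

print(s_main_term(1000.0, 30.0))
```

The damping factor $\log(x^2/n)/\log x$ falls from 1 at $n = x$ to 0 at $n = x^2$, which is the "slight smoothing" that makes the sum over zeros in (10.12) converge absolutely.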



How do you remove the RH from the above analysis? I think it takes great
insight to even suspect that this can be done. Selberg makes a much more subtle
choice for $\sigma_1$. He defines

(10.17) $\sigma_{x,t} = \frac{1}{2} + 2 \max_{\rho \in A} \left( \beta - \frac{1}{2},\ \frac{2}{\log x} \right),$

where

(10.18) $A = \left\{ \rho : |t - \gamma| \le \frac{x^{3|\beta - \frac{1}{2}|}}{\log x} \right\}.$
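The definition of $\sigma_{x,t}$ can be made concrete with a few lines of code. This is a sketch under the reading $\sigma_{x,t} = \frac{1}{2} + 2 \max(\beta - \frac{1}{2}, 2/\log x)$, the maximum running over zeros $\rho = \beta + i\gamma$ with $|t - \gamma| \le x^{3|\beta - 1/2|}/\log x$; the function name and the sample zeros, including the off-line one, are purely hypothetical inputs.

```python
import math

def sigma_x_t(t, x, zeros):
    """Selberg's abscissa for a given t and x.

    zeros is an iterable of (beta, gamma) pairs; only zeros close enough to t,
    in the sense of the set A, are allowed to push the abscissa to the right.
    """
    log_x = math.log(x)
    best = 2.0 / log_x
    for beta, gamma in zeros:
        if abs(t - gamma) <= x ** (3.0 * abs(beta - 0.5)) / log_x:
            best = max(best, beta - 0.5)
    return 0.5 + 2.0 * best

x = math.exp(20.0)
on_line = [(0.5, 14.134725), (0.5, 21.022040)]   # zeros on the critical line
off_line = [(0.7, 100.0)]                        # hypothetical zero off the line
print(sigma_x_t(14.0, x, on_line))    # reduces to 1/2 + 4 / log x
print(sigma_x_t(100.0, x, off_line))  # pushed right by the hypothetical zero
```

When every nearby zero has $\beta = \frac{1}{2}$ the maximum is $2/\log x$, so the choice collapses to a fixed abscissa of the same shape as (10.13); a zero off the line moves the abscissa away from it.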



Thus, we move towards or away from the critical line depending on how far off the
line nearby zeros lie. There is also an issue of convergence, and the explicit formula
(10.12) needs to be replaced by a similar formula where the sum over zeros has a
factor of $(s - \rho)^3$ in the denominator. Ultimately the contribution from zeros off
the $\frac{1}{2}$-line is bounded by a density estimate proved in